Sessionization - A Vital Stage in Data Preprocessing of Web Usage Mining - A Survey
Authors
Abstract
The World Wide Web has affected almost every aspect of our lives in the modern era. The Web has many unique characteristics that make mining useful information and knowledge a challenging task. Web mining uses many data mining techniques, but it is not simply an application of traditional data mining, because of the heterogeneity and unstructured nature of data on the Web. Web mining tasks can be categorized into three types: Web Structure Mining, Web Content Mining, and Web Usage Mining. The goal of Web Usage Mining is to capture, model, and analyze the behavioral patterns and profiles of users interacting with a Web site. Web Usage Mining consists of many stages, but this paper focuses on the first stage, i.e., data preprocessing. Data preprocessing consists of data cleaning, page view identification, sessionization, data integration, and data transformation. This paper focuses on the most complex part of data preprocessing: sessionization. It covers many important aspects of the sessionization stage that are useful for researchers working in the field of Web Usage Mining.
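To make the sessionization step concrete, the following is a minimal sketch of one common time-oriented heuristic: a user's requests are split into sessions whenever the gap between consecutive requests exceeds an inactivity timeout. This is an illustration under stated assumptions, not a method taken from the surveyed paper. The 30-minute timeout, the `(user_id, timestamp, url)` record layout, and the function name `sessionize` are all hypothetical choices made for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumption: a 30-minute inactivity timeout, a commonly used heuristic;
# the value is not taken from the paper.
SESSION_TIMEOUT = timedelta(minutes=30)


def sessionize(requests, timeout=SESSION_TIMEOUT):
    """Group (user_id, timestamp, url) records into per-user sessions.

    A new session starts whenever the gap between two consecutive
    requests from the same user exceeds `timeout`.
    """
    # Collect each user's requests (the record layout is an assumption).
    by_user = defaultdict(list)
    for user_id, ts, url in requests:
        by_user[user_id].append((ts, url))

    sessions = []
    for user_id, hits in by_user.items():
        hits.sort(key=lambda h: h[0])  # order each user's requests by time
        current = [hits[0]]
        for prev, hit in zip(hits, hits[1:]):
            if hit[0] - prev[0] > timeout:
                # Inactivity gap too long: close the current session.
                sessions.append((user_id, current))
                current = []
            current.append(hit)
        sessions.append((user_id, current))
    return sessions


# Hypothetical, already-cleaned log records used only for illustration.
log = [
    ("u1", datetime(2024, 1, 1, 10, 0), "/home"),
    ("u1", datetime(2024, 1, 1, 10, 5), "/products"),
    ("u1", datetime(2024, 1, 1, 11, 0), "/home"),
    ("u2", datetime(2024, 1, 1, 10, 2), "/about"),
]
result = sessionize(log)
for user_id, hits in result:
    print(user_id, [url for _, url in hits])
```

The sketch assumes that data cleaning and user identification have already been done, so each record carries a reliable user identifier. Referrer-based or navigation-oriented heuristics would need additional fields and are not shown here.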
Similar resources
Traversal Pattern Mining in Web Usage Data
Web usage mining aims to discover useful patterns in web usage data; these patterns provide useful information about users’ browsing behavior. This chapter examines different types of web usage traversal patterns and the related techniques used to uncover them, including Association Rules, Sequential Patterns, Frequent Episodes, Maximal Frequent Forward Sequences, and Maximal Frequent S...
A Novel Semantically-Time-Referrer based Approach of Web Usage Mining for Improved Sessionization in Pre-Processing of Web Log
Web usage mining (WUM), also known as Web Log Mining, is the application of data mining techniques to large volumes of data to extract useful and interesting user behaviour patterns from web logs, in order to improve web-based applications. This paper aims to improve data discovery by mining usage data from log files. In this paper the work is done in three phases. Firs...
A Novel Technique for Sessions Identification in Web Usage Mining Preprocessing
The growth of the World Wide Web in recent years has been remarkable. Users find it very difficult to extract useful and relevant information from the huge amount of information available. These problems can be addressed by Web Usage Mining, which involves preprocessing, pattern discovery, and pattern analysis. Preprocessing is an important process that converts raw web log data into transactions. Appl...
A Survey on Preprocessing Methods for Web Usage Data
The World Wide Web is a huge repository of web pages and links. It provides an abundance of information for Internet users. The growth of the web is tremendous, as approximately one million pages are added daily. Users’ accesses are recorded in web logs. Because of the tremendous usage of the web, web log files are growing rapidly and becoming huge. Web data mining is the applicati...
Semantic Preprocessing of Web Request Streams for Web Usage Mining
Efficient data preparation needs to discover the underlying knowledge from complicated Web usage data. In this paper, we have focused on two main tasks, semantic outlier detection from online Web request streams and segmentation (or sessionization) of them. We thereby exploit semantic technologies to infer the relationships among Web requests. Web ontologies such as taxonomies and directories c...